INTRODUCTION


Public health is primarily concerned with disease prevention in human
populations, and epidemiology is the branch of public health that attempts to
discover the causes of disease to make disease prevention possible. Public
health investigations use quantitative methods, which combine the two
disciplines of epidemiology and biostatistics.


STATISTICS Statistics is a science concerned with:
* Collection of data
* Presentation of the collected data
* Analysis and interpretation of the results
* Making decisions based on such analysis


BIOSTATISTICS Biostatistics is that branch of statistics that deals primarily with the 
biological sciences and medical/health-related disciplines. 
Biostatistics


POPULATION AND SAMPLE 


POPULATION A population is the collection of persons, things, or
characteristics that we are interested in investigating.

For example, the collection of persons living in Dubai city who
test positive for hepatitis C.

SAMPLE A sample is a subset of a population.

For example, we may refer to a sample of 50 men over age 65 who
suffer from hypertension, or to a sample of 50 blood pressures
taken on 50 men over age 65 who suffer from hypertension.

A sample should be a representative part of the population.
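
One common way to aim for a representative sample is simple random sampling, in which every member of the population has the same chance of being selected. The sketch below is illustrative only: the population of 1,000 ID numbers, the sample size of 50, and the seed are assumptions, not values from the slides.

```python
import random

# Hypothetical population: ID numbers 1..1000 (illustrative, not from the source).
population = list(range(1, 1001))

# Fixed seed so the illustration is reproducible.
rng = random.Random(42)

# Simple random sample without replacement: each member has an equal chance.
sample = rng.sample(population, k=50)

print(len(sample))                     # 50
print(set(sample) <= set(population))  # True: the sample is a subset of the population
```

Random selection does not guarantee representativeness in any single draw, but it avoids systematic selection bias.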


[Figure: a sample shown as a subset drawn from the population]


VARIABLE 


VARIABLE A characteristic that takes on more than one value
is termed a variable. A variable may also be called
a data item.


EXAMPLES OF VARIABLES Age, gender, business income and 
expenses, country of birth, capital expenditure, 
class grades, eye color, and vehicle type are 
examples of variables. If a sample consists of 50 
males, then gender is not a variable in this sample 
but is termed a constant. If the sample is made up 
of males and females, then gender is a variable in 
this sample. 
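
The distinction between a variable and a constant can be checked directly on a sample by counting distinct values. This is a minimal sketch; the helper name `is_variable` and both samples are hypothetical, chosen to mirror the gender example above.

```python
def is_variable(values):
    """Return True if the characteristic takes more than one distinct value."""
    return len(set(values)) > 1


# Hypothetical samples (illustrative data only).
all_male_sample = ["male"] * 50
mixed_sample = ["male"] * 30 + ["female"] * 20

print(is_variable(all_male_sample))  # False: gender is a constant in this sample
print(is_variable(mixed_sample))     # True: gender is a variable in this sample
```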


INDEPENDENT AND DEPENDENT VARIABLES 


INDEPENDENT (EXPLANATORY) VARIABLE is the cause. Its 
value is independent of other variables in your 
study. 

DEPENDENT (RESPONSE) VARIABLE is the effect. Its value 
depends on changes in the independent variable. 


Experiment: Are test scores impacted by the amount of time spent
sleeping the night before a test?

Independent variable: amount of time spent sleeping the night before the test
Dependent variable: test scores
Exercise


IDENTIFY THE INDEPENDENT AND DEPENDENT VARIABLES


EXAMPLE 1 Evidence is lacking on the impact of smoking on
colorectal cancer (CRC) risk (overall and by age 
at diagnosis) by polygenic risk score (PRS) levels, 
and it is unclear how the magnitude of CRC risk 
associated with smoking compares to the 
magnitude of genetically determined risk. 

EXAMPLE 2 HBV infection was more concentrated among the
population with high economic status.

EXAMPLE 3 Tens of millions of Americans have a chronic 
health condition that increases their risk of 
severe illness from COVID-19. 

EXAMPLE 4 SARS-CoV-2 is not just a respiratory virus that
affects the lungs. It can also affect the stomach,
intestines, heart, blood vessels, liver, kidneys, ...
A questionnaire


COLLECTION FORM 


Module 1: Demographic information of the respondent

Field                                  Codes / instructions
ID                                     01, 02, 03 (one row per respondent)
Name                                   First 3 letters
1.1 Category of respondent             1. HH Head; 2. Others
1.2 Sex                                1. Male; 2. Female; 3. Third gender
1.3 Age                                In years
1.4 Marital status                     1. Married; 2. Unmarried; 3. Widowed; 4. Divorced/Separated
1.5 Highest level of school attended   1. Primary (1-5 years); 2. Secondary high school (6-10 years);
                                       3. Higher secondary (11-12 years); 4. University or higher (>12 years);
                                       5. Madrasa; 6. No schooling
1.6 ...                                ...; 4. Service holder; 5. Business; 6. Day laborer;
                                       7. Professional (Physician/lawyer/teacher); 8. Productive work at HH;
                                       9. Driver; 10. Student; 11. Housewife; 12. Beggar; 13. Unemployed;
                                       14. Others...
1.7 Pregnancy                          Ask the female respondents aged less than 45 years: if she was ever
                                       pregnant in the last one year or is currently pregnant for at least
                                       3 months. 1. Yes; 2. No
1.8 Lactation                          Ask the female respondents aged less than 45 years: if she was
                                       breastfeeding her child/children any time during the last one year.
                                       1. Yes; 2. No


A SNAPSHOT OF A DATA TABLE 


ONSET INFECT 
12-OCT-07 
30-MAY-05 
11-NOV-06 


* Each row corresponds to an observation
* Each column contains information on a variable
* Each cell in the table contains a value
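
The row/column/cell structure described above can be sketched with a list of dictionaries. The ONSET dates are the ones visible in the snapshot; the ID column and its values are hypothetical, added only so each row has an identifier.

```python
# Each dict is one row (observation); each key is one column (variable).
# ONSET dates come from the snapshot; the ID values are hypothetical.
table = [
    {"ID": 1, "ONSET": "12-OCT-07"},
    {"ID": 2, "ONSET": "30-MAY-05"},
    {"ID": 3, "ONSET": "11-NOV-06"},
]

n_observations = len(table)        # number of rows
variables = list(table[0].keys())  # column names
cell = table[1]["ONSET"]           # value in the second row, ONSET column

print(n_observations)  # 3
print(variables)       # ['ID', 'ONSET']
print(cell)            # 30-MAY-05
```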


DATA TYPES/ MEASUREMENT SCALES 


Categorical
* binary: 2 categories
* nominal: more categories
* ordinal: order matters

Quantitative
* discrete: numerical
* continuous: uninterrupted


NOMINAL Values that fall into categories with no natural numerical
value. Nominal data are often coded numerically, but the codes are just
alternate names. For example, 0 for females and 1 for males.

ORDINAL Values that fall into categories that can be qualitatively ordered
but have no intrinsic numerical value. Ordinal data can be 'ranked', e.g.,
level of education.

DISCRETE DATA Measured quantities that take on specific values, usually integers.
E.g., the number of traffic accidents, infant deaths, etc.

CONTINUOUS DATA Measured quantities not restricted to specific values. E.g., birth
weight, blood pressure, etc.
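
The practical difference between nominal and ordinal data shows up in which operations are meaningful. The sketch below is a hedged illustration: the 0/1 sex coding follows the example above, while the three education levels and the five responses are hypothetical.

```python
import statistics

# Nominal: codes are just alternate names (0 = female, 1 = male, as in the text).
# Arithmetic on these codes, such as an average, has no meaning.
sex_codes = {"female": 0, "male": 1}

# Ordinal: categories have a natural order. These levels are hypothetical.
education_order = ["primary", "secondary", "university"]
rank = {level: i for i, level in enumerate(education_order)}

responses = ["university", "primary", "secondary", "primary", "secondary"]

# Ranking is meaningful for ordinal data, so sorting and a median category make sense.
sorted_responses = sorted(responses, key=rank.get)
median_rank = statistics.median_low([rank[r] for r in responses])
median_level = education_order[median_rank]

print(sorted_responses)  # ['primary', 'primary', 'secondary', 'secondary', 'university']
print(median_level)      # secondary
```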
Exercise


CLASSIFY EACH VARIABLE AS NOMINAL, ORDINAL, OR QUANTITATIVE


EXAMPLE 1 Age 

EXAMPLE 2 Age groups 

EXAMPLE 3 Sleeping hours 

EXAMPLE 4 Blood pressure (SBP and DBP)

EXAMPLE 5 Presence of hypertension

EXAMPLE 6 Blood type
CENTRAL DOGMA OF STATISTICS
PARAMETER Any summarization of the elements of a population. E.g., the average
of the blood pressures that make up a population.

STATISTIC Any summarization of the elements of a sample. E.g., the average of
the blood pressures that make up a sample.
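
To make the parameter/statistic distinction concrete, the sketch below computes the same summary (the mean) once on a whole population and once on a sample drawn from it. The population is simulated: the 10,000 values, the mean of 120 mmHg, the standard deviation of 15, and the sample size of 50 are all assumptions for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 systolic blood pressures (mmHg), simulated.
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Parameter: a summary of the entire population.
population_mean = statistics.mean(population)

# Statistic: the same summary computed on a random sample of 50.
sample = rng.sample(population, k=50)
sample_mean = statistics.mean(sample)

print(round(population_mean, 1), round(sample_mean, 1))
```

In practice the population is rarely observed in full, so the sample statistic is used to estimate the unknown parameter.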


Statistics
* Descriptive statistics: presenting, organizing, and summarizing data
* Inferential statistics: drawing conclusions about a population based
  on data observed in a sample
DESCRIPTIVE STATISTICS: EXPLORATORY DATA ANALYSIS


ALWAYS look at your data! 
If you can’t see it, don’t believe it! 


EDA allows us to: 


* Visualize distributions and relationships
* Detect errors
* Assess assumptions for confirmatory analysis


EDA is the first step of data analysis 
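
A first look at the data can be as simple as a numerical summary, a range check, and a rough plot. The sketch below is an assumption-laden illustration: the ages are hypothetical, 230 is a deliberately planted data-entry error, and the plausible range of 0 to 120 years is an assumed rule.

```python
import statistics
from collections import Counter

# Hypothetical ages in years; 230 is a deliberate data-entry error.
ages = [34, 41, 29, 230, 52, 38, 41, 47]

# Numerical summary: the extreme maximum stands out immediately.
summary = {
    "n": len(ages),
    "min": min(ages),
    "median": statistics.median(ages),
    "max": max(ages),
}
print(summary)

# Flag values outside an assumed plausible range of 0 to 120 years.
suspect = [a for a in ages if not 0 <= a <= 120]
print(suspect)  # [230]

# Crude text histogram by decade to visualize the distribution.
for decade, count in sorted(Counter(a // 10 * 10 for a in ages).items()):
    print(f"{decade:>3}: {'#' * count}")
```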